Semi-Supervised Learning of the Electronic Health Record with Denoising Autoencoders for Phenotype Stratification

نویسندگان

  • Brett K. Beaulieu-Jones
  • Casey S. Greene
چکیده

Patient interactions with health care providers result in entries to electronic health records (EHRs). EHRs were built for clinical and billing purposes but contain many data points about an individual. Mining these records provides opportunities to extract electronic phenotypes that can be paired with genetic data to identify genes underlying common human diseases. This task remains challenging: high quality phenotyping is costly and requires physician review; many fields in the records are sparsely filled; and our definitions of diseases are continuing to improve over time. Here we develop and evaluate a semi-supervised learning method for EHR phenotype extraction using denoising autoencoders for phenotype stratification. By combining denoising autoencoders with random forests we find classification improvements across simulation models, particularly in cases where only a small number of patients have high quality phenotype. This situation is commonly encountered in research with EHRs. Denoising autoencoders perform dimensionality reduction allowing visualization and clustering for the discovery of new subtypes of disease. This method represents a promising approach to clarify disease subtypes and improve genotype-phenotype association studies that leverage EHRs.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Semi-supervised learning of the electronic health record for phenotype stratification

Patient interactions with health care providers result in entries to electronic health records (EHRs). EHRs were built for clinical and billing purposes but contain many data points about an individual. Mining these records provides opportunities to extract electronic phenotypes, which can be paired with genetic data to identify genes underlying common human diseases. This task remains challeng...

متن کامل

Deconstructing the Ladder Network Architecture

The manual labeling of data is and will remain a costly endeavor. For this reason, semi-supervised learning remains a topic of practical importance. The recently proposed Ladder Network is one such approach that has proven to be very successful. In addition to the supervised objective, the Ladder Network also adds an unsupervised objective corresponding to the reconstruction costs of a stack of...

متن کامل

Online Semi-Supervised Learning with Deep Hybrid Boltzmann Machines and Denoising Autoencoders

Two novel deep hybrid architectures, the Deep Hybrid Boltzmann Machine and the Deep Hybrid Denoising Auto-encoder, are proposed for handling semisupervised learning problems. The models combine experts that model relevant distributions at different levels of abstraction to improve overall predictive performance on discriminative tasks. Theoretical motivations and algorithms for joint learning f...

متن کامل

Denoising Adversarial Autoencoders: Classifying Skin Lesions Using Limited Labelled Training Data

We propose a novel deep learning model for classifying medical images in the setting where there is a large amount of unlabelled medical data available, but labelled data is in limited supply. We consider the specific case of classifying skin lesions as either malignant or benign. In this setting, the proposed approach – the semi-supervised, denoising adversarial autoencoder – is able to utilis...

متن کامل

Semi-supervised Learning using Denoising Autoencoders for Brain Lesion Detection and Segmentation

The work explores the use of denoising autoencoders (DAEs) for brain lesion detection, segmentation, and false-positive reduction. Stacked denoising autoencoders (SDAEs) were pretrained using a large number of unlabeled patient volumes and fine-tuned with patches drawn from a limited number of patients ([Formula: see text], 40, 65). The results show negligible loss in performance even when SDAE...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016